Guided Frequency Loss for Image Restoration
Image Restoration has seen remarkable progress in recent years. Many
generative models have been adapted to tackle the known restoration cases of
images. However, the benefit of the frequency domain has not been
well explored, despite its major role in these particular cases of image
synthesis. In this study, we propose the Guided Frequency Loss (GFL), which
helps the model to learn in a balanced way the image's frequency content
alongside the spatial content. It aggregates three major components that work
in parallel to enhance learning efficiency: a Charbonnier component, a
Laplacian Pyramid component, and a Gradual Frequency component. We tested GFL
on the Super Resolution and the Denoising tasks. We used three different
datasets and three different architectures for each of them. We found that the
GFL loss improved the PSNR metric in most implemented experiments. Also, it
improved the training of the Super Resolution models in both SwinIR and SRGAN.
In addition, the GFL loss proved more useful on constrained data, owing to
the lower stochasticity of the high-frequency components among
samples.
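As an illustration of the first of the three components named above, the following is a minimal sketch of a Charbonnier penalty in its commonly used form, the mean of sqrt(d^2 + eps^2) over pixel differences d. The abstract does not state the exact formula, the value of eps, or how the three components are weighted, so the function name `charbonnier_loss`, the default `eps`, and the flat-list input are assumptions made for this example, not the authors' implementation.

```python
import math


def charbonnier_loss(pred, target, eps=1e-3):
    """Mean Charbonnier penalty over paired pixel values.

    pred, target: equal-length sequences of floats (a flattened image).
    eps: small smoothing constant (assumed value, not taken from the paper).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    if len(pred) == 0:
        raise ValueError("inputs must not be empty")
    # sqrt(d^2 + eps^2) behaves like |d| for large errors and is smooth near 0.
    total = sum(math.sqrt((p - t) ** 2 + eps ** 2) for p, t in zip(pred, target))
    return total / len(pred)
```

For identical inputs the value equals `eps` rather than zero, which is why the penalty is often described as a smooth approximation of the L1 loss.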
Streamlined Global and Local Features Combinator (SGLC) for High Resolution Image Dehazing
Image Dehazing aims to remove atmospheric fog or haze from an image. Although
the Dehazing models have evolved a lot in recent years, few have precisely
tackled the problem of High-Resolution hazy images. For this kind of image, the
model needs to work on a downscaled version of the image or on cropped patches
from it. In both cases, the accuracy will drop. This is primarily due to the
inherent failure to combine global and local features when the image size
increases. The Dehazing model requires global features to understand the
general scene peculiarities and local features to handle fine,
pixel-level details. In this study, we propose the Streamlined Global and Local
Features Combinator (SGLC) to solve these issues and to optimize the
application of any Dehazing model to High-Resolution images. The SGLC contains
two successive blocks. The first is the Global Features Generator (GFG) which
generates the first version of the Dehazed image containing strong global
features. The second block is the Local Features Enhancer (LFE) which improves
the local feature details inside the previously generated image. When tested on
the Uformer architecture for Dehazing, SGLC increased the PSNR metric by a
significant margin. Any other model can be incorporated inside the SGLC process
to improve its efficiency on High-Resolution input data.
Comment: Accepted in CVPR 2023 Workshop
License Plate Super-Resolution Using Diffusion Models
In surveillance, accurately recognizing license plates is hindered by their
often low quality and small dimensions, compromising recognition precision.
Despite advancements in AI-based image super-resolution, methods like
Convolutional Neural Networks (CNNs) and Generative Adversarial Networks (GANs)
still fall short in enhancing license plate images. This study leverages the
cutting-edge diffusion model, which has consistently outperformed other deep
learning techniques in image restoration. By training this model using a
curated dataset of Saudi license plates, both in low and high resolutions, we
discovered the diffusion model's superior efficacy. The method achieves a
12.55% and 37.32% improvement in Peak Signal-to-Noise Ratio (PSNR) over SwinIR
and ESRGAN, respectively. Moreover, our method surpasses these techniques in
terms of Structural Similarity Index (SSIM), registering a 4.89% and 17.66%
improvement over SwinIR and ESRGAN, respectively. Furthermore, 92% of human
evaluators preferred our images over those from other algorithms. In essence,
this research presents a pioneering solution for license plate
super-resolution, with tangible potential for surveillance systems.
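The PSNR figures quoted above can be made concrete with a short sketch of the standard PSNR definition, 10 * log10(MAX^2 / MSE), together with a relative-improvement helper. The abstract does not say how its percentages were derived, so treating them as relative gains over the baseline score is an assumption, and the names `psnr` and `relative_improvement` are hypothetical.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two flattened images."""
    if len(reference) != len(restored) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


def relative_improvement(baseline, new):
    """Percentage gain of `new` over `baseline` (assumed interpretation)."""
    return 100.0 * (new - baseline) / baseline
```

Under this reading, a method scoring 25 dB against a 20 dB baseline would be reported as a 25% improvement.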
Unsupervised Domain Adaptation Using Generative Adversarial Networks for Semantic Segmentation of Aerial Images
Segmenting aerial images is of great potential in surveillance and scene understanding of urban areas. It provides a means for automatic reporting of the different events that happen in inhabited areas. This remarkably promotes public safety and traffic management applications. After the wide adoption of convolutional neural network methods, the accuracy of semantic segmentation algorithms could easily surpass 80% if a robust dataset is provided. Despite this success, deploying a pretrained segmentation model to survey a new city that is not included in the training set significantly decreases accuracy. This is due to the domain shift between the source dataset on which the model is trained and the target domain of the new city images. In this paper, we address this issue and consider the challenge of domain adaptation in semantic segmentation of aerial images. We designed an algorithm that reduces the domain shift impact using generative adversarial networks (GANs). In the experiments, we tested the proposed methodology on the International Society for Photogrammetry and Remote Sensing (ISPRS) semantic segmentation dataset and found that our method improves overall accuracy from 35% to 52% when passing from the Potsdam domain (considered as source domain) to the Vaihingen domain (considered as target domain). In addition, the method efficiently recovers the classes inverted due to sensor variation. In particular, it improves the average segmentation accuracy of these inverted classes from 14% to 61%.
Drone deep reinforcement learning: A review
Unmanned Aerial Vehicles (UAVs) are increasingly being used in many challenging and diversified applications. These applications span the civilian and military fields and include, to name a few, infrastructure inspection, traffic patrolling, remote sensing, mapping, surveillance, rescuing humans and animals, environment monitoring, and Intelligence, Surveillance, Target Acquisition, and Reconnaissance (ISTAR) operations. However, the use of UAVs in these applications needs a substantial level of autonomy. In other words, UAVs should have the ability to accomplish planned missions in unexpected situations without requiring human intervention. To ensure this level of autonomy, many artificial intelligence algorithms were designed. These algorithms targeted the guidance, navigation, and control (GNC) of UAVs. In this paper, we described the state of the art of one subset of these algorithms: the deep reinforcement learning (DRL) techniques. We made a detailed description of them, and we deduced the current limitations in this area. We noted that most of these DRL methods were designed to ensure stable and smooth UAV navigation by training in computer-simulated environments. We realized that further research efforts are needed to address the challenges that restrain their deployment in real-life scenarios.
A Machine Learning Approach Involving Functional Connectivity Features to Classify Rest-EEG Psychogenic Non-Epileptic Seizures from Healthy Controls
Until now, clinicians have not been able to evaluate Psychogenic Non-Epileptic Seizures (PNES) from the rest-electroencephalography (EEG) readout. No EEG marker can help differentiate PNES cases from healthy subjects. In this paper, we have investigated the power spectrum density (PSD), in resting-state EEGs, to evaluate the abnormalities in PNES-affected brains. Additionally, we have used functional connectivity tools, such as phase lag index (PLI), and graph-derived metrics to better observe the integration of distributed information of regular and synchronized multi-scale communication within and across inter-regional brain areas. We demonstrated the utility of our method on a cohort of 20 age- and gender-matched PNES subjects and 19 healthy control (HC) subjects. In this work, three classification models, namely support vector machine (SVM), linear discriminant analysis (LDA), and multilayer perceptron (MLP), have been employed to model the relationship between the functional connectivity features (rest-HC versus rest-PNES). The best performance for the discrimination of participants was obtained using the MLP classifier, reporting a precision of 85.73%, a recall of 86.57%, an F1-score of 78.98%, and, finally, an accuracy of 91.02%. In conclusion, our results suggest two main points. The first is an intrinsic organization of functional brain networks that reflects a dysfunctional level of integration across brain regions, which can provide new insights into the pathophysiological mechanisms of PNES. The second is that functional connectivity features and MLP could be a promising method to classify rest-EEG data of PNES from healthy control subjects.
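The phase lag index mentioned above is usually defined as the absolute value of the time-averaged sign of sin(phase difference) between two channels. The sketch below follows that standard definition and assumes the instantaneous phases have already been extracted (for example with a Hilbert transform, which is not shown). The function name `phase_lag_index` and the list-based input are choices made for this example; the abstract does not describe the authors' actual pipeline.

```python
import math


def phase_lag_index(phase_a, phase_b):
    """PLI between two channels given their instantaneous phases in radians.

    Returns a value in [0, 1]: 0 means no consistent lead or lag,
    1 means one channel consistently leads the other.
    """
    if len(phase_a) != len(phase_b) or len(phase_a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0
    for a, b in zip(phase_a, phase_b):
        s = math.sin(a - b)
        total += (s > 0) - (s < 0)  # sign of sin(phase difference): -1, 0, or 1
    return abs(total) / len(phase_a)
```

A constant positive phase difference gives a PLI of 1, while differences that flip sign equally often give 0.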
Prediction of Arabic Legal Rulings Using Large Language Models
In the intricate field of legal studies, the analysis of court decisions is a cornerstone for the effective functioning of the judicial system. The ability to predict court outcomes helps judges during the decision-making process and equips lawyers with invaluable insights, enhancing their strategic approaches to cases. Despite its significance, the domain of Arabic court analysis remains under-explored. This paper pioneers a comprehensive predictive analysis of Arabic court decisions on a dataset of 10,813 real commercial court cases, leveraging the advanced capabilities of the current state-of-the-art large language models. Through a systematic exploration, we evaluate three prevalent foundational models (LLaMA-7b, JAIS-13b, and GPT-3.5-turbo) and three training paradigms: zero-shot, one-shot, and tailored fine-tuning. In addition, we assess the benefit of summarizing and/or translating the original Arabic input texts. This leads to a spectrum of 14 model variants, for which we offer a granular performance assessment with a series of different metrics (human assessment, GPT evaluation, ROUGE, and BLEU scores). We show that all variants of LLaMA models yield limited performance, whereas GPT-3.5-based models outperform all other models by a wide margin, surpassing the average score of the dedicated Arabic-centric JAIS model by 50%. Furthermore, we show that all scores except human evaluation are inconsistent and unreliable for assessing the performance of large language models on court decision predictions. This study paves the way for future research, bridging the gap between computational linguistics and Arabic legal analytics.